A Three-Layered Collocation Extraction Tool and Its Application in China English Studies
Abstract
We design a three-layered collocation extraction tool that integrates syntactic and semantic knowledge, and we apply it in China English studies. The tool first extracts peripheral collocations in the frequency layer from dependency triples, then extracts semi-peripheral collocations in the syntactic layer using association measures, and finally extracts core collocations in the semantic layer using a thesaurus of similar words. The syntactic constraints filter out much of the noise from surface co-occurrences, and the semantic constraints are effective in identifying the true “core” collocations. We apply the tool to automatically extract collocations from a large corpus of China English that we compiled, in order to explore how China English, as a variety of English, is nativized. We then analyze the similarities and differences among the typical China English collocations of a group of verbs. The tool and results can be applied in compiling language resources for Chinese-English translation and in corpus-based China studies.
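The three layers described in the abstract can be illustrated with a minimal sketch. The abstract does not name the association measure, the thresholds, or the exact semantic test, so everything specific here is an assumption made for illustration: pointwise mutual information (PMI) stands in for the association measure, the thresholds `min_freq` and `min_pmi` are hypothetical, and a triple is treated as "core" when no thesaurus substitute for its dependent scores an equal or higher PMI with the same head. The function name `extract_layers` and the toy data are likewise invented.

```python
import math
from collections import Counter


def extract_layers(triples, thesaurus, min_freq=2, min_pmi=0.3):
    """Return (peripheral, semi_peripheral, core) sets of dependency triples.

    triples   : list of (head, relation, dependent) tuples, one per occurrence
    thesaurus : dict mapping a word to a list of similar words (assumed format)
    """
    counts = Counter(triples)
    total = sum(counts.values())
    head_counts, dep_counts = Counter(), Counter()
    for (head, rel, dep), n in counts.items():
        head_counts[(head, rel)] += n
        dep_counts[(rel, dep)] += n

    def pmi(triple):
        # Simplified PMI between (head, relation) and (relation, dependent).
        head, rel, dep = triple
        n = counts[triple]
        if n == 0:
            return float("-inf")  # unseen combination
        return math.log2(n * total / (head_counts[(head, rel)] * dep_counts[(rel, dep)]))

    # Frequency layer: triples that occur often enough.
    peripheral = {t for t, n in counts.items() if n >= min_freq}
    # Syntactic layer: frequent triples that also pass the association threshold.
    semi_peripheral = {t for t in peripheral if pmi(t) >= min_pmi}
    # Semantic layer (assumed test): swapping in a similar word must weaken the link.
    core = set()
    for head, rel, dep in semi_peripheral:
        rivals = [(head, rel, sub) for sub in thesaurus.get(dep, [])]
        if all(pmi(r) < pmi((head, rel, dep)) for r in rivals):
            core.add((head, rel, dep))
    return peripheral, semi_peripheral, core


# Toy data, invented for illustration only.
SAMPLE = (
    [("make", "dobj", "decision")] * 4
    + [("make", "dobj", "measure")] * 1
    + [("take", "dobj", "step")] * 4
    + [("take", "dobj", "measure")] * 3
)
THESAURUS = {"decision": ["choice"], "step": ["measure"], "measure": ["step"]}

peripheral, semi_peripheral, core = extract_layers(SAMPLE, THESAURUS)
```

On this toy input, "take measure" passes the frequency and association layers but is dropped from the core set because its near-synonym "step" associates more strongly with "take". Note that a word with no thesaurus entry passes the semantic test vacuously in this sketch; a real tool would need a policy for that case.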
Similar resources
A Comparative Analysis of Collocation in Arabic-English Translations of the Glorious Quran
The Qur’an is the only holy book of Muslims all around the world. Each person, whatever their religion and language, is interested in comprehending and accepting the rules and regulations of their own belief. Translation of the Qur’an is only an attempt to present its meaning. One of the greatest challenges in translating the Qur’an is collocation. A collocation is a sequence of words or terms that co-o...
Automatic Extraction of English Collocations and their Chinese-English Bilingual Examples: A Computational Tool for Bilingual Lexicography
This paper describes the procedures involved in developing EXEC, a web-based system which can automatically extract English collocations and their Chinese-English bilingual examples from parallel corpora. The system draws on statistics, dependency parsing, and Chinese-English parallel corpora of more than 13 million English words and 27 million Chinese characters. By taking a word as well as th...
A Tool for Multi-Word Collocation Extraction and Visualization in Multilingual Corpora
This document describes an implemented collocation extraction system which is designed as an aid to translation and which will be used in a real translation environment. Its main functionalities are: retrieving multi-word collocations from an existing corpus of documents in a given language (only French and English are supported for the time being); visualizing the list of extracted terms and t...
The Comparison of Native English and Persian Elementary School Students’ Performance on Lexical and Grammatical Collocations
The importance and manner of language learning/acquisition have been a major concern for decades. Many factors play important roles in this regard. This research compared the performance of native Persian and English elementary students to see whether there is any significant difference between the two groups and which type of collocation each group performed better on. For ...
Propagation of a Monopulse in Layered and Inhomogeneous Media Using an Adaptive Multiscale Wavelet Collocation Method
Figure 2: Electric field in layered and inhomogeneous (linear profile) media. The evolution of the transmitted pulse in the inhomogeneous layer can be seen.